Gene Prediction by Pattern Recognition and Homology Search

Authors

  • Ying Xu
  • Edward C. Uberbacher
Abstract

This paper presents an algorithm for combining pattern recognition-based exon prediction and database homology search in gene model construction. The goal is to use homologous genes or partial genes existing in the database as reference models while constructing (multiple) gene models from exon candidates predicted by pattern recognition methods. A unified framework for gene modeling is used for genes ranging from situations with strong homology to no homology in the database. To maximally use the homology information available, the algorithm applies homology on three levels: (1) exon candidate evaluation, (2) gene-segment construction with a reference model, and (3) (complete) gene modeling. Preliminary testing has been done on the algorithm. Test results show that (a) perfect gene modeling can be expected when the initial exon predictions are reasonably good and a strong homology exists in the database; (b) homology (not necessarily strong) in general helps improve the accuracy of gene modeling; (c) multiple gene modeling becomes feasible when homology exists in the database for the involved genes.
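
As a rough illustration of the idea in the abstract, the sketch below assembles one gene model from exon candidates by choosing the best-scoring set of non-overlapping candidates, where each candidate's score is its pattern-recognition score plus a weighted homology score. This is not the authors' algorithm: the `Exon` fields, the `homology_weight` parameter, the additive scoring, and the use of simple weighted interval scheduling are all assumptions made for the example, and the sketch ignores reading-frame and splice-site compatibility.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    """Hypothetical exon candidate; field names are illustrative only."""
    start: int             # first base, inclusive
    end: int               # last base, inclusive
    pattern: float         # assumed pattern-recognition score
    homology: float = 0.0  # assumed database-match score, 0 if no hit


def build_gene_model(candidates, homology_weight=1.0):
    """Pick non-overlapping exons with the highest total combined score.

    Classic weighted interval scheduling by dynamic programming.
    Returns (chosen exons in genomic order, total score).
    """
    exons = sorted(candidates, key=lambda e: e.end)
    ends = [e.end for e in exons]
    best = [0.0] * (len(exons) + 1)  # best[k]: best score using first k exons
    take = [False] * len(exons)      # whether exon i is used at step i
    prev = [0] * len(exons)          # count of exons ending before exon i starts

    for i, e in enumerate(exons):
        # Exons among the first i that end strictly before e starts.
        prev[i] = bisect_right(ends, e.start - 1, 0, i)
        score = e.pattern + homology_weight * e.homology
        with_e = best[prev[i]] + score
        if with_e > best[i]:
            best[i + 1] = with_e
            take[i] = True
        else:
            best[i + 1] = best[i]

    # Trace back through the table to recover the chosen exons.
    chosen = []
    i = len(exons)
    while i > 0:
        if take[i - 1]:
            chosen.append(exons[i - 1])
            i = prev[i - 1]
        else:
            i -= 1
    return chosen[::-1], best[-1]


# Two overlapping candidates compete for the first exon slot.
a = Exon(10, 50, pattern=0.6)
b = Exon(40, 90, pattern=0.7)
c = Exon(100, 150, pattern=0.5)
model_no_hit, _ = build_gene_model([a, b, c])

# Same candidates, but the first one now has a database match.
a_hit = Exon(10, 50, pattern=0.6, homology=0.5)
model_hit, total_hit = build_gene_model([a_hit, b, c])
```

In this toy setting, a homology hit on the weaker of two overlapping candidates changes which one enters the model, which loosely mirrors the abstract's claim that homology, even when not strong, can improve gene modeling.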

Similar resources

Expression analyses of endoglucanase gene in Penicillium oxalicum and Trichoderma viride

The expression of the endoglucanase gene and the protein profile of two fungal species with high cellulase enzyme activity, Penicillium oxalicum 1SMS and Trichoderma viride 156MS, were investigated. Fungal isolates were cultured on inducer CMC medium, and the amounts of released sugar and protein were then assayed every three days for a month, using arsenate molybdate reagent and the Bradford method,...

Bioinformatics Analysis of Upstream Region and Protein Structure of Fungal Phytase Gene

Phytase increases the bioavailability of phytate phosphorus in seed-based animal feeds and reduces the phosphorus pollution of animal waste. Since most animal feeds for pellets are heated up to 65-80 °C, the production of a thermostable structure for phytase can be useful. In this study, we sought to perform bioinformatics analysis of the upstream region and protein structure of fungal phytase ...

An Editing Environment for DNA Sequence Analysis

This paper presents a computer system for analyzing and annotating large-scale genomic sequences. The core of the system is a multiple-gene structure identification program, which predicts the most "probable" gene structures based on the given evidence, including pattern recognition, EST and protein homology information. A graphics-based user interface provides an environment which allows the us...

Maximum Likelihood Estimation of Weight Matrices for Targeted Homology Search

Genome annotation relies to a large extent on the recognition of homologs to already known genes. The starting point for such protocols is a collection of known sequences from one or more species, from which a model is constructed – either automatically or manually – that encodes the defining features of a single gene or a gene family. The quality of these models eventually determines the succe...

Comparison of the Lipophosphoglycan 3 Gene of the Lizard and Mammalian Leishmania: A Homology Modeling

Background: Lipophosphoglycan 3 (LPG3) is required for the assembly of LPG, a well-known virulence molecule. In this study, the LPG3 genes of the lizard and mammalian Leishmania species were cloned and sequenced. A three-dimensional (3D) structure for the target sequence was also predicted by comparative (homology) modeling. Materials and Methods: An optimization PCR amplification was performed o...

Journal:
  • Proceedings. International Conference on Intelligent Systems for Molecular Biology

Volume: 4

Publication date: 1996